- What is a tree?
- How is a tree built?
- What are phylogenetic data?
7.17.19
Skink tree from Wright et al. 2015
Dolphin, Alex Vasenin via WikiMedia
Ask a Biologist illustration of homology
Taxonomy
Hennig, 1950 Grundzüge einer Theorie der Phylogenetischen Systematik
Sneath & Sokal, 1963, 1973
library(phytools)
## Loading required package: ape
## Loading required package: maps
tree <- pbtree(n = 5)
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
Tip: The unit we are placing on the tree. May be a species, an individual, or a higher-order taxon. Also called a terminal node, leaf, or one-degree node.
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
tree$tip.label
## [1] "t4" "t5" "t1" "t2" "t3"
Access in R: tree$tip.label
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
nodelabels(cex = 4)
Node: Where branches meet, implying a most recent common ancestor. Also called a vertex or a three-degree node.
library(ape)
tree$Nnode
## [1] 4
getMRCA(tree, c("t1", "t2"))
## [1] 6
# plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
tree$edge
##      [,1] [,2]
## [1,]    6    7
## [2,]    7    8
## [3,]    8    1
## [4,]    8    2
## [5,]    7    3
## [6,]    6    9
## [7,]    9    4
## [8,]    9    5
Branch: What connects a node to a tip or to another node. Branch lengths can be measured in a variety of units, which we will discuss over the next few days. Also called an edge. Access in R: tree$edge
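To make the edge matrix easier to read, here is a minimal sketch on a hand-written three-tip tree. The Newick string and the names toy and edge_table are made up for illustration. In ape, tips are numbered 1 to Ntip(tree), the root is Ntip(tree) + 1, and each row of the edge matrix is one branch running from a parent node to a child.

```r
library(ape)

# A small hand-written tree, so the numbering is easy to follow
toy <- read.tree(text = "((A:1,B:1):1,C:2);")

toy$tip.label   # tips are numbered in the order they appear here
Ntip(toy)       # number of tips
toy$Nnode       # number of internal nodes

# Pair each branch (parent -> child) with its length
edge_table <- data.frame(
  parent       = toy$edge[, 1],
  child        = toy$edge[, 2],
  length       = toy$edge.length,
  child_is_tip = toy$edge[, 2] <= Ntip(toy)
)
print(edge_table)
```

Rows where child_is_tip is TRUE are the branches leading to tips; the remaining rows connect internal nodes.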
library(phytools)
plotBranchbyTrait(tree, tree$edge.length, mode = "edges")
tree$edge.length
## [1] 0.01070042 1.00622147 0.36177554 0.36177554 1.36799701 0.23368315
## [7] 1.14501427 1.14501427
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5, direction = "downwards")
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5, type="fan")
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
nodelabels(cex = 3.5)
rotateNodes(tree, c(7, 8))
## 
## Phylogenetic tree with 5 tips and 4 internal nodes.
## 
## Tip labels:
## [1] "t1" "t5" "t4" "t2" "t3"
## 
## Rooted; includes branch lengths.
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
is.monophyletic(tree, c("t1", "t2"), plot = TRUE, edge.width = 1.5, cex = 3.5, no.margin = TRUE)
## [1] FALSE
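Because pbtree() simulates a random tree, the answers above change from run to run. As a reproducible sketch, the same two functions can be tried on a fixed four-tip tree; the Newick string and the name toy are illustrative assumptions.

```r
library(ape)

# Fixed topology: A and B are sisters, C and D are sisters
toy <- read.tree(text = "((A,B),(C,D));")

# Most recent common ancestor, returned as a node number
mrca_ab <- getMRCA(toy, c("A", "B"))  # the node joining A and B
mrca_ac <- getMRCA(toy, c("A", "C"))  # the root, because A and C sit on opposite sides

# A group is monophyletic if it contains an ancestor and all of its descendants
mono_ab <- is.monophyletic(toy, c("A", "B"))
mono_ac <- is.monophyletic(toy, c("A", "C"))

print(c(mrca_ab = mrca_ab, mrca_ac = mrca_ac))
print(c(mono_ab = mono_ab, mono_ac = mono_ac))
```

A and C are not monophyletic here because their common ancestor is the root, which also gave rise to B and D.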
# reroot(tree, node.number)
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
Ingroup: Taxa of interest
Outgroup: A closely related taxon outside the ingroup, used to root the tree
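A minimal sketch of outgroup rooting with ape's root() function. The taxon names and Newick string are made up for illustration; resolve.root = TRUE asks ape to place a bifurcating root on the branch leading to the outgroup.

```r
library(ape)

# An unrooted tree: three lineages meet at the base
unrooted <- read.tree(text = "(Outgroup,A,(B,C));")
rooted_before <- is.rooted(unrooted)

# Root on the branch leading to the outgroup
rooted <- root(unrooted, outgroup = "Outgroup", resolve.root = TRUE)
rooted_after <- is.rooted(rooted)

# With the outgroup at the root, the ingroup forms a single clade
ingroup_is_clade <- is.monophyletic(rooted, c("A", "B", "C"))

print(c(before = rooted_before, after = rooted_after, ingroup_is_clade = ingroup_is_clade))
```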
unroot_tree <- unroot(tree)
plot(unroot_tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
library(alignfigR)
## Welcome to alignfigR!
char_data <- read_alignment("../../data/bears_fasta.fa")
char_data[1:3]
## $Agriarctos_spp
##  [1] "?" "0" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "0"
## [18] "0" "0" "1" "1" "1" "1" "0" "0" "1" "?" "1" "1" "?" "0" "1" "1" "1"
## [35] "1" "0" "1" "1" "0" "?" "?" "0" "1" "1" "1" "0" "?" "?" "?" "?" "?"
## [52] "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?"
## 
## $Ailurarctos_lufengensis
##  [1] "?" "0" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?"
## [18] "0" "0" "1" "1" "1" "1" "0" "1" "1" "?" "1" "1" "?" "0" "?" "?" "?"
## [35] "?" "0" "1" "1" "1" "?" "0" "0" "1" "1" "1" "0" "1" "0" "1" "1" "0"
## [52] "1" "1" "?" "?" "?" "?" "?" "?" "?" "?" "?"
## 
## $Ailuropoda_melanoleuca
##  [1] "1" "0" "1" "1" "1" "1" "0" "1" "1" "0" "1" "0" "0" "1" "0" "0" "0"
## [18] "0" "0" "1" "1" "1" "1" "0" "1" "0" "1" "1" "1" "0" "0" "1" "0" "1"
## [35] "0" "0" "1" "1" "0" "0" "0" "0" "1" "1" "1" "0" "1" "0" "0" "1" "0"
## [52] "1" "1" "0" "0" "0" "1" "0" "0" "0" "1" "0"
These data are binary: each character is scored 0 or 1, with ? marking missing data
Always arranged as a “matrix”: each row is a taxon and each column is a character
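As a small sketch of that matrix structure, a named list of character vectors (the same shape as char_data above) can be stacked into a taxon-by-character matrix in base R. The taxa and states below are invented for illustration.

```r
# Toy data in the same shape as char_data: one character vector per taxon
toy_chars <- list(
  Taxon_A = c("0", "1", "?"),
  Taxon_B = c("1", "1", "0"),
  Taxon_C = c("0", "?", "1")
)

# Stack the vectors so that rows are taxa and columns are characters
char_matrix <- do.call(rbind, toy_chars)
colnames(char_matrix) <- paste0("char", seq_len(ncol(char_matrix)))

print(char_matrix)
dim(char_matrix)                 # taxa x characters
char_matrix["Taxon_B", ]         # all characters for one taxon
colMeans(char_matrix == "?")     # proportion of missing data per character
```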
Text editor - phylo data, metadata
DNA data tend to be simple
Example character from Brady:
image via Ask a Biologist
How do we know we’ve captured the relevant character axes?
image via Ask a Biologist, Mike Hagelberg
library(ggplot2)
colors <- c("blue", "purple","white")
plot_alignment(char_data, colors, taxon_labels = TRUE) + theme(text = element_text(size=40))
How do we go from this to a tree?
??? Have them start installs on the next page while we do this.
RStudio -or- Shiny
library(treesiftr)
## Registered S3 method overwritten by 'treeio': ## method from ## root.phylo ape
## ## Attaching package: 'treesiftr'
## The following object is masked _by_ '.GlobalEnv': ## ## tree
aln_path <- "../../data/bears_fasta.fa"
bears <- read_alignment(aln_path)
bear_tree <- multi2di(read.tree("../../data/starting_tree.tre"))
sample_df <- generate_sliding(bears, start_char = 1, stop_char = 5, steps = 1)
print(sample_df)
##   starting_val stop_val step_val
## 1            1        2        1
## 2            2        3        1
## 3            3        4        1
## 4            4        5        1
## 5            5        6        1
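The table above describes a sliding window over the character matrix: each row is one window of characters for which a tree will be built. The following base R sketch reproduces the same table under the assumption that each window starts at a multiple of the step and ends one step later; sliding_windows is a hypothetical helper, not treesiftr's own code.

```r
# Hypothetical helper: build window boundaries like those printed above
sliding_windows <- function(start_char, stop_char, steps) {
  starts <- seq(start_char, stop_char, by = steps)
  data.frame(
    starting_val = starts,
    stop_val     = starts + steps,  # each window ends one step after it starts
    step_val     = steps
  )
}

windows <- sliding_windows(start_char = 1, stop_char = 5, steps = 1)
print(windows)
```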
library(phangorn)
library(ggtree)
## ## Attaching package: 'ggtree'
## The following object is masked from 'package:phytools': ## ## read.newick
## The following object is masked from 'package:ape': ## ## rotate
output_vector <- generate_tree_vis(sample_df = sample_df, alignment = aln_path,
tree = bear_tree,
phy_mat = bears, pscore = TRUE)
## Generating tree for charset:12
## Final p-score 2 after 0 nni operations
## Generating tree for charset:23
## Final p-score 2 after 0 nni operations
## Generating tree for charset:34
## Final p-score 2 after 0 nni operations
## Generating tree for charset:45
## Final p-score 2 after 0 nni operations
## Generating tree for charset:56
## Final p-score 2 after 0 nni operations
output_vector #sample output - you will get more than this when you run in your console
## [[1]]
## NULL
## 
## [[2]]
## NULL
## 
## [[3]]
## NULL
## 
## [[4]]
## NULL
## 
## [[5]]
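The p-score printed in the log above is a parsimony score: the minimum number of character-state changes a tree requires to explain the data. As a rough sketch of where such a number comes from, the code below scores a single unordered character with Fitch's algorithm. The function fitch_score, the toy trees, and the toy states are all hypothetical, and this is not how phangorn or treesiftr implement the calculation.

```r
library(ape)

# Hypothetical helper: Fitch parsimony score for one character.
# `states` is a named vector giving the observed state of each tip.
fitch_score <- function(tree, states) {
  n_tip <- Ntip(tree)
  visit <- function(node) {
    if (node <= n_tip) {
      # A tip: its state set is the observed state, at no cost
      return(list(set = states[[tree$tip.label[node]]], cost = 0))
    }
    kids <- tree$edge[tree$edge[, 1] == node, 2]
    results <- lapply(kids, visit)
    cost <- sum(vapply(results, function(r) r$cost, numeric(1)))
    set <- results[[1]]$set
    for (r in results[-1]) {
      shared <- intersect(set, r$set)
      if (length(shared) == 0) {
        # No state in common: one change is needed on this part of the tree
        set <- union(set, r$set)
        cost <- cost + 1
      } else {
        set <- shared
      }
    }
    list(set = set, cost = cost)
  }
  visit(n_tip + 1)$cost  # in ape, the root is node Ntip + 1
}

states <- c(A = "0", B = "0", C = "1", D = "1")
grouped_tree <- read.tree(text = "((A,B),(C,D));")  # like states are sisters
mixed_tree   <- read.tree(text = "((A,C),(B,D));")  # like states are separated

score_grouped <- fitch_score(grouped_tree, states)
score_mixed   <- fitch_score(mixed_tree, states)
print(c(grouped = score_grouped, mixed = score_mixed))
```

The tree that groups taxa sharing a state needs fewer changes, so parsimony prefers it.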
Parsimony Trees
This is one character. Imagine many: enumerating every possible tree quickly becomes impossible. Also note that several trees tie for the “best” score
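To see why enumeration breaks down, the number of distinct fully bifurcating trees can be computed directly: for n taxa there are (2n - 5)!! unrooted and (2n - 3)!! rooted topologies, where !! is the double factorial (the product of every other integer down to 1). The function names below are illustrative.

```r
# Number of unrooted binary tree topologies for n taxa: (2n - 5)!!
n_unrooted_trees <- function(n_taxa) {
  stopifnot(n_taxa >= 3)
  prod(seq(1, 2 * n_taxa - 5, by = 2))
}

# Number of rooted binary tree topologies for n taxa: (2n - 3)!!
n_rooted_trees <- function(n_taxa) {
  stopifnot(n_taxa >= 2)
  prod(seq(1, 2 * n_taxa - 3, by = 2))
}

taxa <- c(4, 5, 10, 20)
tree_counts <- data.frame(
  taxa     = taxa,
  unrooted = sapply(taxa, n_unrooted_trees),
  rooted   = sapply(taxa, n_rooted_trees)
)
print(tree_counts)
```

Ten taxa already give more than two million unrooted topologies, which is why exhaustive searches are only practical for small datasets.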
Image via Mark Holder
execute data/bears_morphology.nex
cstatus
tstatus
showmatrix
showdist
log file="mylogfile"
alltrees
What happened here?
Parsimony Trees
Heuristic: a search that uses shortcuts to reduce the number of trees we need to evaluate
hsearch
hsearch swap = nni
hsearch swap = spr
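For comparison with the PAUP commands above, here is a rough R sketch of the same idea using phangorn's optim.parsimony(), which rearranges a starting tree with NNI or SPR moves and keeps changes that lower the parsimony score. The character matrix and starting tree are invented for illustration, and PAUP's hsearch is a separate implementation, so results need not match.

```r
library(ape)
library(phangorn)

# Invented binary characters: rows are taxa, columns are characters
chars <- rbind(
  A = c("0", "0", "0", "0"),
  B = c("0", "0", "1", "1"),
  C = c("1", "1", "1", "1"),
  D = c("1", "1", "0", "1"),
  E = c("1", "0", "0", "1")
)
dat <- phyDat(chars, type = "USER", levels = c("0", "1"))

# A deliberately arbitrary starting tree
start_tree  <- read.tree(text = "(A,C,(B,(D,E)));")
start_score <- parsimony(start_tree, dat)

# Heuristic searches: swap branches and keep improvements
nni_tree <- optim.parsimony(start_tree, dat, rearrangements = "NNI", trace = 0)
spr_tree <- optim.parsimony(start_tree, dat, rearrangements = "SPR", trace = 0)

nni_score <- parsimony(nni_tree, dat)
spr_score <- parsimony(spr_tree, dat)
print(start_score)
print(nni_score)
print(spr_score)
```

NNI only tries swaps around a single internal branch, while SPR prunes a subtree and regrafts it elsewhere, so SPR explores more trees per step at a higher computational cost.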
savetrees from=1 to=1 file=results/tree1.tre;
savetrees from=2 to=2 file=results/tree2.tre;
savetrees from=3 to=3 file=results/tree3.tre;